MAJORITY BOOTSTRAP PERCOLATION ON G(n,p)


CECILIA HOLMGREN, TOMAS JUSKEVICIUS, AND NATHAN KETTLE


Abstract. Majority bootstrap percolation on a graph G is an epidemic
process defined in the following manner. Firstly, an initially infected set
of vertices is selected. Then step by step the vertices that have more
infected than non-infected neighbours are infected. We say that percolation
occurs if eventually all vertices in G become infected.

In this paper we study majority bootstrap percolation on the Erdős-Rényi
random graph G(n,p) above the connectivity threshold. Perhaps
surprisingly, the results obtained for small p are comparable to the
results for the hypercube obtained by Balogh, Bollobás and Morris [2].


1. Introduction 


The classical bootstrap percolation, called r-neighbour bootstrap
percolation, concerns a deterministic process on a graph. Firstly, a subset of
the vertices of a graph G is initially infected. Then at each time step the
infection spreads to any vertex with at least r infected neighbours. This
process is a cellular automaton, of the type first introduced by von Neumann
in [13]. This particular model was introduced by Chalupa, Leath and Reich
in [6], where G was taken to be the Bethe lattice.

A standard way of choosing the initially infected vertices is to
independently infect each vertex with probability p. The probability that the
entire graph eventually becomes infected is increasing with p. It is therefore
sensible to study the quantity $p_c = \inf\{p : \mathbb{P}_p(G \text{ infected}) > c\}$, in particular
the critical probability $p_{1/2}$ and the size of the critical window $p_{1-\epsilon} - p_{\epsilon}$.

A natural setting for this problem is the finite grid $[n]^d$. Many of the
results on bootstrap percolation concern this problem. The first to study this
graph were Aizenman and Lebowitz in [1], who showed that in 2-neighbour
bootstrap percolation when d is fixed we have $p_{1/2} = \Theta((\log n)^{1-d})$.

The r-neighbour bootstrap percolation process has also been studied
on the random regular graph by Balogh in [3] and on the Erdős-Rényi
random graph G(n,p) by Janson, Luczak, Turova and Vallier in [8].

In majority bootstrap percolation a vertex becomes infected if a
majority of its neighbours are. In [2] Balogh, Bollobás and Morris studied
this process on the hypercube and showed that if the vertices of the
n-dimensional hypercube are independently infected with probability
\[ q = \frac{1}{2} - \frac{1}{2}\sqrt{\frac{\log n}{n}} + \lambda\frac{\log\log n}{\sqrt{n\log n}}, \]
then, with high probability, percolation occurs (i.e., all vertices eventually
become infected) if $\lambda > \frac{1}{2}$ and does not occur if $\lambda \le -2$.

In this paper we shall study majority bootstrap percolation on the
Erdős-Rényi random graph G(n,p) above the connectivity threshold. We
will see that for small p our results are in fact comparable to the results
for the hypercube in [2], noting that the degree of each vertex in the
n-dimensional hypercube (with $2^n$ vertices) is equal to n.

2. Main Results 

In this section we shall state our main results and discuss two different
ways of selecting the initially infected set. The proofs of these theorems (in
Section 3 and Section 4) use inequalities that are described separately in
Section 5.

For a graph G with some subset $I_0 \subseteq V(G)$ of initially infected
vertices, the majority bootstrap process on G is defined by setting
\[ I_{t+1} = I_t \cup \{v \in V(G) : |I_t \cap \Gamma(v)| \ge |\Gamma(v)|/2\}, \]
where $\Gamma(v)$ is the neighbourhood of v.

For a finite graph G, this process will terminate with $I_{T+1} = I_T$ for some T. Denote by
$I = I_T$ the set of eventually infected vertices.
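The process just defined can be sketched in a few lines of Python. This is only an illustrative simulation, not part of the paper's argument: the function names `sample_gnp` and `majority_bootstrap` are hypothetical, vertices are labelled 0..n-1 rather than 1..n, and the infection rule is taken to be "at least half of the neighbours are infected", as in the definition above.

```python
import random

def sample_gnp(n, p, rng):
    """Adjacency sets of a sample of G(n,p) on vertices 0..n-1 (illustrative helper)."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:  # each edge is present independently with probability p
                adj[u].add(v)
                adj[v].add(u)
    return adj

def majority_bootstrap(adj, initially_infected):
    """Iterate I_{t+1} = I_t ∪ {v : |I_t ∩ Γ(v)| >= |Γ(v)|/2} until nothing changes."""
    infected = set(initially_infected)
    while True:
        newly = {v for v in range(len(adj))
                 if v not in infected
                 and 2 * len(adj[v] & infected) >= len(adj[v])}
        if not newly:
            return infected  # the set I of eventually infected vertices
        infected |= newly
```

With this rule an uninfected isolated vertex is infected in the first step (0 >= 0), while an isolated edge with two uninfected endpoints never becomes infected.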

We shall look at the case of G = G(n,p), the graph on n vertices,
where each edge is included independently with probability p. Often
$p := p(n) \to 0$ as $n \to \infty$, but we use the standard notation and simply write p also
for functions depending on n. Our initial setup is slightly different from that for
the hypercube mentioned above: instead of infecting each vertex independently
with some probability q, we shall infect a random set of vertices of
size m := m(n).

In the normal setup for the majority bootstrap process on G(n,p), we
would first choose the edges of G(n,p), and then choose an initially infected
set $I_0$ uniformly from $[n]^{(m)}$, the family of m-element subsets of [n]. As these
two choices are independent we shall equivalently set $I_0 = [m]$, and then choose
the edges of G(n,p). This is the MB(n,p;m) process.

We now introduce some notation that shall be used. We use the
standard asymptotic little-o notation and this is always taken as n or N
tends to infinity, i.e., if $(b_n)$ is a sequence of numbers, we say that $b_n = o(a_n)$
if $b_n/a_n \to 0$, as $n \to \infty$. We set $d = \frac{pn}{1-p}$, thus d is roughly the average degree
in G(n,p) for p = o(1). We denote the binomial distribution with parameters
n and p by B(n,p). We shall sometimes abuse the notation and denote by
B(n,p) a random variable that has a binomial distribution. We reserve m
for the size of $I_0$ and shall always assume that
\[ m = \frac{n}{2} - \frac{n}{2}\sqrt{\frac{\log d}{d}} + \lambda n\frac{\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right) \]
for some constant $\lambda$. We also use the standard notation that an event $E_n$
holds with high probability, i.e., for the event $E_n$ it holds that $\mathbb{P}(E_n) \to 1$, as
$n \to \infty$. Let $\omega(n)$ denote some arbitrary positive function that is increasing
and unbounded, as n tends to infinity.
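To get a feeling for the scale of m, the leading terms of the formula above can be evaluated numerically. The sketch below is only an illustration: the helper name `threshold_size` is hypothetical, the error term $o(\cdot)$ is dropped, and the value $d = pn/(1-p)$ is an assumption read off from the later calculations (the extracted text does not show the definition of d explicitly).

```python
from math import e, log, sqrt

def threshold_size(n, p, lam):
    """Leading terms of m: n/2 - (n/2)sqrt(log d / d) + lam * n * logloglog d / sqrt(d log d)."""
    d = p * n / (1 - p)  # ASSUMPTION: d = pn/(1-p), inferred from the proofs
    if d <= e:
        raise ValueError("need d > e so that log log log d is defined")
    main = n / 2 - (n / 2) * sqrt(log(d) / d)
    correction = lam * n * log(log(log(d))) / sqrt(d * log(d))
    return main + correction
```

For example, `threshold_size(10**6, 0.01, 0.0)` lies a little below n/2, and increasing `lam` moves the initial set size upwards once $\log\log\log d > 0$.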

The inequalities below are only claimed to be true for n large enough.
For the MB(n,p;m) process, define
\[ \mathcal{P}_m(G(n,p)) = \mathbb{P}(I = [n]). \]
We shall now state the main result of this paper.

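Theorem 1. Fix some number $\epsilon > 0$. Assume that for n large enough,
\[ (1+\epsilon)\log n \le p(1-p)n. \]
If the initially infected set $I_0$ has size
\[ m = \frac{n}{2} - \frac{n}{2}\sqrt{\frac{\log d}{d}} + \lambda n\frac{\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right), \]
then
\[ \mathcal{P}_m(G(n,p)) \to \begin{cases} 1 & \text{if } \lambda > \frac{1}{2}, \\ 0 & \text{if } \lambda < 0. \end{cases} \]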


Our second result concerns a more natural setup, where each vertex
is initially independently infected with probability q. In this setup we have that, with
high probability, $||I_0| - qn| < \omega(n)\sqrt{q(1-q)n}$. When $\sqrt{n} \ll n\frac{\log\log\log d}{\sqrt{d\log d}}$,
i.e., when $p \ll \frac{(\log\log\log n)^2}{\log n}$, our result above shall still hold in this setting
for q = m/n.

More formally, define MB'(n,p;q) to be the process in which the
graph G(n,p) is chosen, and each vertex is initially infected independently
with probability q. Then the infection spreads by the majority bootstrap
percolation process. For the process MB'(n,p;q) define
\[ \mathcal{P}'_q(G(n,p)) = \mathbb{P}(I = [n]). \]

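Corollary 2. Fix some number $\epsilon > 0$. Assume that for n large enough,
\[ (1+\epsilon)\log n \le p(1-p)n. \]
If $p \ll \frac{(\log\log\log n)^2}{\log n}$, then with $q = \frac{1}{2} - \frac{1}{2}\sqrt{\frac{\log d}{d}} + \lambda\frac{\log\log\log d}{\sqrt{d\log d}}$ we have
\[ \mathcal{P}'_q(G(n,p)) \to \begin{cases} 1 & \text{if } \lambda > \frac{1}{2}, \\ 0 & \text{if } \lambda < 0. \end{cases} \]
If $p \gg \frac{(\log\log\log n)^2}{\log n}$, then with $q = \frac{1}{2} - \frac{1}{2}\sqrt{\frac{\log d}{d}} + \frac{\theta}{\sqrt{n}}$ we have
\[ \mathcal{P}'_q(G(n,p)) \to \Phi(2\theta), \]
where $\Phi(x)$ denotes the distribution function of the standard Normal random
variable.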


Proof. As each vertex is infected independently, $|I_0|$ has distribution B(n,q).
Thus, with high probability, it holds that $||I_0| - qn| < \omega(n)\sqrt{q(1-q)n}$.
If $p \ll \frac{(\log\log\log n)^2}{\log n}$ then $n\frac{\log\log\log d}{\sqrt{d\log d}} \gg \sqrt{n}$ and the result follows from
Theorem 1.

If $p \gg \frac{(\log\log\log n)^2}{\log n}$, then for each fixed $\delta > 0$ by the Central Limit
Theorem we obtain
\[
\begin{aligned}
\mathcal{P}'_q(G(n,p)) &= \sum_{m=0}^{n} \mathbb{P}(B(n,q) = m)\,\mathcal{P}_m(G(n,p)) \\
&\ge \mathbb{P}\left(B(n,q) \ge qn + (\delta-\theta)\sqrt{n}\right)\mathcal{P}_{\lceil qn+(\delta-\theta)\sqrt{n}\rceil}(G(n,p)) \\
&= \mathbb{P}\left(\frac{B(n,q)}{\sqrt{q(1-q)n}} \ge \frac{qn + (\delta-\theta)\sqrt{n}}{\sqrt{q(1-q)n}}\right)(1+o(1)) \\
&\to \Phi(2(\theta-\delta)),
\end{aligned}
\]
where the fourth line follows as $\mathcal{P}_{\lceil qn+(\delta-\theta)\sqrt{n}\rceil}(G(n,p)) \to 1$ for $p \gg \frac{(\log\log\log n)^2}{\log n}$
by Theorem 1. A similar argument shows that
\[ 1 - \mathcal{P}'_q(G(n,p)) \ge \Phi(-2(\theta+\delta))(1+o(1)), \]
and so
\[ \mathcal{P}'_q(G(n,p)) \to \Phi(2\theta). \]
□
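The Central Limit Theorem step in the proof above can be checked numerically on a small instance. The sketch below is illustrative only: the helper names are hypothetical, and it assumes q = 1/2 exactly (in the corollary q only tends to 1/2), so that $\sqrt{q(1-q)n} = \sqrt{n}/2$ and the tail $\mathbb{P}(B(n,q) \ge qn + (\delta-\theta)\sqrt{n})$ is compared directly with $\Phi(2(\theta-\delta))$.

```python
from math import ceil, comb, erf, sqrt

def std_normal_cdf(x):
    """Distribution function Phi of the standard Normal random variable."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def binomial_upper_tail(n, q, threshold):
    """Exact P(B(n,q) >= threshold), summed term by term."""
    k0 = max(0, ceil(threshold))
    return sum(comb(n, k) * q**k * (1 - q)**(n - k) for k in range(k0, n + 1))

# Illustrative parameters (assumptions, not values from the paper).
n, q, theta, delta = 400, 0.5, 0.75, 0.25
exact = binomial_upper_tail(n, q, q * n + (delta - theta) * sqrt(n))
limit = std_normal_cdf(2 * (theta - delta))
```

For these parameters the exact tail and the Normal limit agree to within a few hundredths, which is the size of error one expects from the CLT at n = 400.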

When p is smaller than the connectivity threshold, G(n,p) contains
isolated vertices. Due to the way we define the MB(n,p;m) process, any
uninfected isolated vertex becomes infected in the first time step, so this is
not an obstruction to complete percolation. However, once p drops below
$\frac{\log n}{2n}$ then, with high probability, G(n,p) contains isolated edges and neither
endpoint of an isolated edge becomes infected if both endpoints are initially
uninfected. This means that $\mathcal{P}_m(G(n,p)) \to 0$ unless m = n - o(n).

Remark 3. Preliminary versions of this paper (including the same results)
were included in the PhD thesis by Kettle [10] and in the PhD thesis by
Juskevicius [9]. There is also a recent study by Stefansson and Vallier [12] on
this subject using completely different methods from those used in
this paper (but similar to the methods used by Janson, Luczak, Turova
and Vallier in [8]), where they show the first asymptotics of the threshold
$m \sim \frac{n}{2}$ in Theorem 1 above, and thus similarly the first asymptotics of the
threshold $q \sim \frac{1}{2}$ in Corollary 2 above.

3. Upper Bound 

As G is finite the MB(n,p;m) process will eventually terminate with
some set $I \subseteq [n]$ of infected vertices. If we do not infect the whole graph,
or, equivalently, we have that $I \ne [n]$, then we can say something about
the structure of I. We shall call a proper subset S of [n] closed if for all
$v \in [n] \setminus S$ we have $|\Gamma(v) \cap S| < |\Gamma(v)|/2$. Recall that $I_0$ is the set of initially
infected vertices and that a vertex $v \in I_{t+1}$ if either $v \in I_t$ or if at least
half of its neighbours lie in $I_t$. In particular $I_t \subseteq I_{t+1}$. If the majority
bootstrap process does not percolate, let T be such that the process has
stabilized, i.e., $I = I_T = I_{T+1} \ne [n]$. Then I is a closed set, and thus
the set $I_0$ of initially infected vertices must be a subset of a closed set.
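The link between closed sets and non-percolation can be checked on a toy graph. The sketch below is illustrative only: `is_closed` and `final_infected` are hypothetical helper names, vertices are labelled 0..n-1, and the infection rule is "at least half of the neighbours", matching the recollection above.

```python
def is_closed(adj, subset):
    """True if subset S is proper and every v outside S has |Γ(v) ∩ S| < |Γ(v)|/2."""
    s = set(subset)
    if len(s) >= len(adj):
        return False  # a closed set must be a proper subset of the vertex set
    return all(2 * len(adj[v] & s) < len(adj[v])
               for v in range(len(adj)) if v not in s)

def final_infected(adj, initially_infected):
    """Run the majority bootstrap process until it stabilises and return I."""
    infected = set(initially_infected)
    while True:
        newly = {v for v in range(len(adj))
                 if v not in infected
                 and 2 * len(adj[v] & infected) >= len(adj[v])}
        if not newly:
            return infected
        infected |= newly

# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = [{1, 2}, {0, 2}, {0, 1, 3}, {2, 4, 5}, {3, 5}, {3, 4}]
stable = final_infected(adj, {0, 1})
```

Here the process started from {0, 1} stops at {0, 1, 2}, which is a closed set containing the initial set, exactly the obstruction described in the text.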

We shall show that, if $\lambda > \frac{1}{2}$ then, with high probability, $I_0$ is contained in
no closed set, in three stages. Lemma 5 will show that, with high
probability, the graph G(n,p) has no "large" closed sets. After that we shall
bound the expected number of medium sized closed sets that $I_0$ is contained
in, hence by the Markov inequality it will follow that, with high probability,
there are no medium sized closed sets containing $I_0$. But before we proceed
with proving these two facts, we shall show that, with high probability, the
number of infected vertices after one time step, $|I_1|$, is large, and so $I_0$ can
rarely be contained in a small closed set. Recall that

\[ m = \frac{n}{2} - \frac{n}{2}\sqrt{\frac{\log d}{d}} + \lambda n\frac{\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right). \]

We assume that for some fixed $\epsilon > 0$ it holds that for n large enough
$(1+\epsilon)\log n \le p(1-p)n$. However, for some of the results below it is enough
to assume that $p(1-p)n = \omega(n)$.


Lemma 4. In the MB(n,p;m) process,
\[ |I_1 \setminus I_0| \ge \frac{n(\log\log d)^{2\lambda}}{e^8\sqrt{d\log d}} \]
with high probability.


Proof. For $i \in [n] \setminus I_0$, denote by $A_i$ the event that vertex i is infected at
time one, that is, the event that i has at least as many neighbours in $I_0$ as it
does in $[n] \setminus I_0$. The events $A_i$ are identical and very weakly correlated but not
independent. Let X be the number of vertices infected at the first step of
the process. Then $X = |I_1 \setminus I_0| = \sum_{i \in [n]\setminus I_0} \mathbf{1}(A_i)$. We shall use Chebyshev's
inequality to bound the probability that X is small.

As the events $A_i$ are identical we shall set $r = \mathbb{P}(A_i)$, so $\mathbb{E}(X) =
(n-m)r$. Let B(m,p) and B(n-m-1, 1-p) be independent random
variables with means $\mu_1$ and $\mu_2$, respectively. We have that
\[
\begin{aligned}
r &= \mathbb{P}\left(|\Gamma(i) \cap I_0| \ge |\Gamma(i) \cap ([n]\setminus I_0)|\right) \\
&= \mathbb{P}\left(B(m,p) \ge B(n-m-1,p)\right) \\
&= \mathbb{P}\left(B(m,p) + B(n-m-1,1-p) \ge \mu_1 + \mu_2 + p(n-2m-1)\right).
\end{aligned}
\]
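For small n the probability r can be computed exactly, which is a convenient sanity check on the reduction to two independent binomials. The sketch below is illustrative only: `binom_pmf` and `first_step_probability` are hypothetical names, and since the extracted display does not show reliably whether the comparison is strict, the strictness is left as a parameter.

```python
from math import comb

def binom_pmf(n, p):
    """Probability mass function of B(n,p) as a list indexed by k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def first_step_probability(n, m, p, strict=False):
    """P(B(m,p) >= B(n-m-1,p)) for independent binomials (or > when strict=True)."""
    inside = binom_pmf(m, p)            # neighbours of the vertex inside I_0
    outside = binom_pmf(n - m - 1, p)   # neighbours among the other uninfected vertices
    cdf_out, acc = [], 0.0
    for mass in outside:                # cdf_out[j] = P(outside <= j)
        acc += mass
        cdf_out.append(acc)
    total = 0.0
    for k, mass in enumerate(inside):
        j = k - 1 if strict else k      # largest admissible value of the outside count
        if j >= 0:
            total += mass * cdf_out[min(j, len(cdf_out) - 1)]
    return total
```

When m = n - m - 1 the two binomials are identically distributed, so the strict and non-strict probabilities sum to one; for smaller m the probability drops below 1/2, which is the regime studied in the lemma.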


For $p \gg \frac{1}{n}$, we have $p(n-2m-1) = \omega(n)\sqrt{p(1-p)n}$ and $p(n-2m-1)^2 = o(n\sqrt{p(1-p)n})$.
Applying the bound from Proposition 21 to the last
equality with $N = \frac{n-1}{2}$, $S = \frac{n-1-2m}{2}$ and $h = p(n-2m-1)$, we obtain
\[
\begin{aligned}
r &\ge \frac{\sqrt{p(1-p)(n-1)}}{2\pi p(n-2m-1)}\exp\left(-\frac{p(n-2m-1)^2}{2(1-p)(n-1)} - 4 - o(1)\right) \\
&\ge \frac{1}{2\pi\sqrt{\log d}}\exp\left(-\frac{\log d}{2} + 2\lambda\log\log\log d - 4 + o(1)\right) \\
&\ge \frac{(\log\log d)^{2\lambda}}{2\pi e^4\sqrt{d\log d}}(1+o(1)),
\end{aligned}
\]
where in the second line we have used the asymptotic relation
\[ d(n-2m-1)^2 = n^2\log d - 4\lambda n^2\log\log\log d + o(n^2). \qquad (1) \]


Let us calculate the variance of X. Let
\[ r' = \mathbb{P}(A_j \mid A_i) - r, \]
this being the same for any $i \ne j$. We have
\[
\begin{aligned}
\mathrm{Var}(X) &= \sum_{i,j \in [n]\setminus[m]} \left(\mathbb{P}(A_j \mid A_i) - \mathbb{P}(A_j)\right)\mathbb{P}(A_i) \\
&= (1-r)r(n-m) + r'r(n-m)(n-m-1), \qquad (2)
\end{aligned}
\]
where the first term in (2) is the sum over i = j and the second term is the
sum over $i \ne j$. Let $B_{ij}$ and $\bar{B}_{ij}$ be the events that ij is, or is not, an edge
in G respectively. Note that

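\[ \mathbb{P}(A_j \mid B_{ij}) = \mathbb{P}\left(B(m,p) \ge B(n-m-2,p) + 1\right) \qquad (3) \]
and
\[ \mathbb{P}(A_j \mid \bar{B}_{ij}) = \mathbb{P}\left(B(m,p) \ge B(n-m-2,p)\right), \qquad (4) \]
where B(m,p) and B(n-m-2,p) are independent random variables. Note
that $\mathbb{P}(A_j \mid B_{ij}) \le \mathbb{P}(A_j \mid \bar{B}_{ij})$, hence we may bound r' by
\[
\begin{aligned}
r' &= \mathbb{P}(A_j \mid A_i) - \mathbb{P}(A_j) = \mathbb{P}(A_j \mid B_{ij})\mathbb{P}(B_{ij} \mid A_i) + \mathbb{P}(A_j \mid \bar{B}_{ij})\mathbb{P}(\bar{B}_{ij} \mid A_i) - \mathbb{P}(A_j) \\
&\le \mathbb{P}(A_j \mid \bar{B}_{ij}) - \mathbb{P}(A_j) \\
&= \mathbb{P}(A_j \mid \bar{B}_{ij})(1 - (1-p)) - p\,\mathbb{P}(A_j \mid B_{ij}) \\
&= p\left(\mathbb{P}(A_j \mid \bar{B}_{ij}) - \mathbb{P}(A_j \mid B_{ij})\right) \\
&= p\,\mathbb{P}\left(B(m,p) = B(n-m-2,p)\right),
\end{aligned}
\]
where the last equality follows from (3) and (4).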

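As $p\left(\frac{n}{2} - m - 1\right) = \omega(n)\sqrt{p(1-p)n}$ (which is true for $p \gg \frac{1}{n}$) we get
from Proposition 23 applied with $(N,S,T) = \left(\frac{n}{2} - 1, \frac{n}{2} - m - 1, 0\right)$, that r'
is at most
\[
\begin{aligned}
&\frac{p\left(\frac{n}{2}-m-1\right)}{2\pi(1-p)\left(\frac{n}{2}-1\right)}\exp\left(-\frac{p\left(\frac{n}{2}-m-1\right)^2}{(1-p)\left(\frac{n}{2}-1\right)} + o(1)\right)
+ \frac{3}{\pi\left(\frac{n}{2}-m-1\right)}\exp\left(-\frac{9p\left(\frac{n}{2}-m-1\right)^2}{8(1-p)\left(\frac{n}{2}-1\right)}\right) \\
&\quad\le \frac{p\sqrt{\log d}}{2\pi(1-p)\sqrt{d}}\exp\left(-\frac{\log d}{2} + 2\lambda\log\log\log d + o(1)\right)
+ \frac{6\sqrt{d}}{\pi n\sqrt{\log d}}\exp\left(-\frac{9\log d}{16} + \frac{9\lambda\log\log\log d}{4}\right).
\end{aligned}
\]
The second term is much smaller than the first term, and so (for n
large enough)
\[ r' \le \frac{p\sqrt{\log d}\,(\log\log d)^{2\lambda}}{\pi(1-p)d}. \qquad (5) \]
We are now able to bound the probability that X is small. From (2)
and Chebyshev's inequality we get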


\[
\begin{aligned}
\mathbb{P}\left(X \le \frac{(n-m)r}{2}\right) &\le \mathbb{P}\left(|X - (n-m)r| \ge \frac{(n-m)r}{2}\right) \\
&\le \frac{4\mathrm{Var}(X)}{((n-m)r)^2} \\
&= \frac{4((1-r) + (n-m-1)r')}{(n-m)r} \\
&\le \frac{4r'}{r} + o(1).
\end{aligned}
\]
From (1) and (5) this is o(1), and so we have, with high probability, that
$|I_1 \setminus I_0|$ is at least $\frac{(n-m)r}{2}$. By
using $m \le \frac{n}{2}$ we get that for large n,
\[ \frac{(n-m)r}{2} \ge \frac{n(\log\log d)^{2\lambda}}{e^8\sqrt{d\log d}}, \]
which completes the proof. □


We now show that G(n,p ) contains no large closed sets by a simple 
edge set comparison. 


Lemma 5. Suppose that for some fixed $\epsilon > 0$ we have
\[ p(1-p)n \ge (1+\epsilon)\log n, \]
for n large enough. Then, with high probability, G(n,p) contains no closed
set of size greater than $\frac{n}{2} + \frac{7n}{2\sqrt{d}}$.

Proof. Let us write s for the size of the set S, i.e., s = |S|. In order for
the set S to be closed, each vertex $v \in [n] \setminus S$ has to have the majority of
its neighbours outside S. In other words, we must have $|\Gamma(v) \cap ([n]\setminus S)| >
|\Gamma(v) \cap S|$. Summing over the vertices in $[n] \setminus S$, we have that the number
of edges from S to $[n] \setminus S$ must be fewer than twice the number of edges in
$[n] \setminus S$.

If ^ < s < Hp, then p(2s — n) > 7y/p(l — p)n, and so 

ps(n — s) — 3 (n — s)s/p( 1 — p)s >22 ) ~ s )v / 4 '(i- — p){n — s). 

By Proposition 1241 every set of size n — s has at most 

P { n 2 S ) + 2( - n _ S )v / P ( 1 s) 

edges with probability at least 1 — and by Proposition \To\ every set 5 
of size s has at least 


ps(n — s) — 3(n — s) \/p(l — p)s 
edges between it and its complement with probability at least 1 — 
Therefore, with high probability, every set S of size 

n 7 n 4n 

2 + 17i <s< ~ 

is not closed. 

If s > -;p and p( 1 — p)n > 41ogn, then we know from Proposition [26] 
that with probability at least 1 — n~"^o there does not exist a closed set of 
size s in G(n,p). The result follows as ^*>i = o(l). 

If n — n 28 > s > ^ and 5 log n > p(l — p)n > (1 + e) log n, then we 

know from Corollary [27] that with probability at least 1 — there does 

not exist a closed set of size s in G(n,p). 

27 

If s > n — n 28 and 51ogn > p( 1 — p)n > (1 + e)logn, then we 
know from Proposition [29] that with probability at least 1 — n ™ every set 
: = [n]\S of size n — s has at most 2 (n — s) edges, and so has a vertex v s c 
of degree at most 4. By Proposition [28] we have that, with high probability, 
the minimum degree of G{n,p) is at least 9, and so Vgc will become infected 
if all of S is infected, and so S is not closed. 

□ 


Lastly, we turn to bounding the expected number of medium sized
closed sets in which $I_0$ is contained. We shall therefore want a bound on the
probability that a set S of size at least s in a particular range of s is closed. To
do this we shall pick a test set T of a suitable size and bound the probability
that none of the vertices in T are infected by S.


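Lemma 6. Fix $\epsilon > 0$ and define
\[ s = \frac{n}{2} - \frac{n\sqrt{\log d}}{2\sqrt{d}} + \frac{n(\log\log d)^{1+\epsilon}}{\sqrt{d\log d}}. \]
Take any set of vertices S in G(n,p) of size $s \le |S| \le \frac{2n}{3}$. Then for n
large enough,
\[ \mathbb{P}(S \text{ is closed}) \le \exp\left(-\frac{n(\log d)^{(\log\log d)^{\epsilon}-2}}{e^7\sqrt{d}}\right). \]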

Proof. Let S be a set of vertices such that $s \le |S| \le \frac{2n}{3}$. Consider a set
$T \subseteq V(G) \setminus S$ of size $t = \left\lceil \frac{2n}{e(\log d)^2} \right\rceil$. We shall condition on the edge set of T
as once we have done so the events $F_v$, that v is not infected by S for each
vertex $v \in T$, are independent.

Denote by E = E(T) the edge set of T, and set $d_E(v)$ to be the degree
of vertex $v \in T$, when T has edge set E. We have that
\[ \mathbb{P}(F_v \mid E) = \mathbb{P}\left(|\Gamma(v) \cap S| < d_E(v) + |\Gamma(v) \cap ([n] \setminus (S \cup T))|\right). \]
Therefore,
\[
\begin{aligned}
\mathbb{P}(S \text{ is closed}) &\le \sum_{E} \mathbb{P}(E) \prod_{v \in T} \mathbb{P}(F_v \mid E) \\
&= \sum_{E} \mathbb{P}(E) \prod_{v \in T} \mathbb{P}\left(B(|S|,p) < B(n-|S|-t,p) + d_E(v)\right),
\end{aligned}
\]
where $\mathbb{P}(E)$ is the probability of a particular edge set $E \in \{0,1\}^{\binom{t}{2}}$ and is
equal to $p^{|E|}(1-p)^{\binom{t}{2}-|E|}$.

The function $f_{|S|}(x) = \mathbb{P}(B(|S|,p) < B(n-|S|-t,p) + x)$ (for independent
binomial random variables B(|S|,p) and B(n-|S|-t,p)) is decreasing
in |S|, so we have $f_s(x) \ge f_{|S|}(x)$. Let us suppress the dependency on s by
writing f(x) instead of $f_s(x)$. We have
\[ \mathbb{P}(S \text{ is closed}) \le \sum_{E} \mathbb{P}(E) \prod_{v \in T} f(d_E(v)). \qquad (6) \]
The rest of the proof shall be spent bounding (6). The degree of
vertices in T is heavily concentrated around pt, and we shall expand f
around pt to show that (6) is not much larger than $f(pt)^t$.

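We have by Corollary 13 that f is log-concave, and so for any x and
y with $f(y) \ne 0$,
\[ f(x) \le f(y)\left(\frac{f(y+1)}{f(y)}\right)^{x-y}. \]
Setting $y = \lceil pt \rceil \in \mathbb{N}$ we get
\[ \mathbb{P}(S \text{ is closed}) \le \sum_{E} \mathbb{P}(E) \prod_{v \in T} f(y)\left(\frac{f(y+1)}{f(y)}\right)^{d_E(v)-y}. \]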

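There is no dependence on E other than its size, and so
\[
\begin{aligned}
\mathbb{P}(S \text{ is closed}) &\le \sum_{E} p^{|E|}(1-p)^{\binom{t}{2}-|E|}\left(\frac{f(y+1)}{f(y)}\right)^{2|E|-ty} f(y)^t \\
&= \left(1 - p + p\left(\frac{f(y+1)}{f(y)}\right)^2\right)^{\binom{t}{2}}\left(\frac{f(y)}{f(y+1)}\right)^{ty} f(y)^t. \qquad (7)
\end{aligned}
\]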

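Setting $\frac{f(y+1)}{f(y)} = 1 + a$, we bound (7) using the inequalities $1 + w \le e^w$
and $(1+x)^{-1} \le 1 - x + x^2$ for $x \ge 0$ to get
\[
\begin{aligned}
\mathbb{P}(S \text{ is closed}) &\le \left(1 + 2ap + a^2p\right)^{\binom{t}{2}}\left(\frac{1}{1+a}\right)^{pt^2} f(y)^t \\
&\le \exp\left((2ap + a^2p)\frac{t^2}{2} + (a^2 - a)pt^2\right) f(y)^t \\
&= \exp\left(\frac{3pa^2t^2}{2}\right) f(y)^t. \qquad (8)
\end{aligned}
\]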

We have that
\[ f(y+1) = f(y) + \mathbb{P}\left(B(s,p) = B(n-s-t,p) + y\right). \]
Let us write $z = \mathbb{P}(B(s,p) = B(n-s-t,p) + y)$ to ease the notation.
Thus, f(y+1) = f(y) + z. By Proposition 23 applied with $N = \frac{n-t+T}{2}$,
$S = \frac{n-2s-t+T}{2}$ and $T = \lceil pt \rceil$, and noting that $0 \le T - pt < 1$, we have


n - 2s + \ 
z < -——- r~ exp 


+ 


27r(l — p)n 
6 


exp 


2p(§ ~ *) 2 
(1 -p){n-t) 

9 p(n — 2.s) 2 


+ o(l) 


7T (n — 2s) 

_ V^ogd 
2-7r(l — p)\fd 
6 \[d 

+-7i=^ ex P 

7rnvlog « 


16(1 -p)(n + |)y 
exp f(-^ + 2(log logd) 1+e )(l + ^) + o(l)^ 
91ogd 9(loglogd) 1+e 


16 


+ 


+ o(l) 


(9) 


The second term in (9) is much smaller than the first so as $6 < 2\pi$
and $t\log d = o(n)$ we get (for n large enough)
\[ z \le \frac{\sqrt{\log d}\,(\log d)^{2(\log\log d)^{\epsilon}}}{6(1-p)d}. \]

We can rewrite f(y) as
\[ f(y) = 1 - \mathbb{P}\left(B(s,p) + B(n-s-t,1-p) \ge n-s-t+y\right). \]
We have for $p \gg \frac{1}{n}$ the asymptotic relation
\[ (p(n-2s)+1)(t+2s-n) = o(n\sqrt{np(1-p)}), \]
and so using Proposition 21 with $(N,S,h) = \left(\frac{n-t}{2}, \frac{n-2s-t}{2}, p(n-2s)+y-pt\right)$
we obtain (for n large enough) that
\[
\begin{aligned}
f(y) &\le 1 - \frac{\sqrt{p(1-p)(n-t)}}{2\pi(p(n-2s)+1)}\exp\left(-\frac{(p(n-2s)+1)^2}{2p(1-p)(n-t)} - 4 - o(1)\right) \\
&\le 1 - \frac{(\log d)^{2(\log\log d)^{\epsilon}}}{e^6\sqrt{d\log d}} \\
&\le \exp\left(-\frac{(\log d)^{(\log\log d)^{\epsilon}}}{e^6\sqrt{d}}\right), \qquad (10)
\end{aligned}
\]
the second inequality follows from the same reasoning used in (1) and that
$e^6 > 2\pi e^4$.

We can also apply Proposition 22 to get a lower bound on f(y) (for
n large enough) of
\[ f(y) \ge 1 - \frac{\sqrt{p(1-p)(n-t)}}{p(n-2s)}\exp\left(-\frac{p(n-2s)^2}{2(1-p)(n-t)} + 3 + o(1)\right) \ge \frac{1}{2}, \]
here the bound on 1 - f(y) is actually o(1), being within a constant factor
of the bound in (10).

We are now able to get a good upper bound on a,
\[ a = \frac{z}{f(y)} \le \frac{\sqrt{\log d}\,(\log d)^{2(\log\log d)^{\epsilon}}}{3(1-p)d}. \]
Substituting these bounds into (8) we get (for n large enough)
\[ \mathbb{P}(S \text{ is closed}) \le \exp\left(\frac{p(\log d)^{4(\log\log d)^{\epsilon}}n}{6(1-p)^2d^2\log d} - \frac{(\log d)^{(\log\log d)^{\epsilon}}}{e^6\sqrt{d}}\right)^{t}. \]
The second term in the exponential is much larger than the first term,
and so (for n large enough)
\[
\begin{aligned}
\mathbb{P}(S \text{ is closed}) &\le \exp\left(-\frac{(\log d)^{(\log\log d)^{\epsilon}}}{2e^6\sqrt{d}}\right)^{t} \\
&\le \exp\left(-\frac{n(\log d)^{(\log\log d)^{\epsilon}-2}}{e^7\sqrt{d}}\right),
\end{aligned}
\]
as $t \ge \frac{2n}{e(\log d)^2}$. □

We shall now bound the expected number of closed sets in this medium
sized range that contain $I_0$. This is also a bound on the probability that $I_0$
is contained in such a medium sized closed set.


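Proposition 7. Assume that
\[ m = \frac{n}{2} - \frac{n\sqrt{\log d}}{2\sqrt{d}} + \frac{\lambda n\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right), \]
and choose some $\epsilon > 0$. Then the expected number of closed sets in G(n,p)
of size between
\[ \frac{n}{2} - \frac{n\sqrt{\log d}}{2\sqrt{d}} + \frac{n(\log\log d)^{1+\epsilon}}{\sqrt{d\log d}} \quad \text{and} \quad \frac{n}{2} + \frac{4n}{\sqrt{d}} \]
that contain $I_0$ is o(1).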

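Proof. Let S be a set of size s in our range; s can take at most $\left\lfloor \frac{n\sqrt{\log d}}{\sqrt{d}} \right\rfloor$
different values. For each possible value of s and n large enough (using
Stirling's formula) there are at most
\[ \binom{n-m}{s-m} \le \binom{n}{\left\lfloor \frac{n\sqrt{\log d}}{\sqrt{d}} \right\rfloor} \le \left(\frac{e\sqrt{d}}{\sqrt{\log d}}\right)^{\frac{n\sqrt{\log d}}{\sqrt{d}}} \le \exp\left(\frac{n(\log d)^2}{\sqrt{d}}\right) \]
possible closed sets that can contain $I_0$. By Lemma 6 the expected number
of closed sets is (for n large enough) less than
\[ \frac{n\sqrt{\log d}}{\sqrt{d}} \cdot \exp\left(\frac{n(\log d)^2}{\sqrt{d}} - \frac{n(\log d)^{(\log\log d)^{\epsilon}-2}}{e^7\sqrt{d}}\right), \]
and this is o(1) as $(\log\log d)^{\epsilon}$ is unbounded. □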

Corollary 8. Fix some number $\epsilon > 0$. Assume that for n large enough,
\[ (1+\epsilon)\log n \le p(1-p)n. \]
If the initially infected set $I_0$ has size
\[ m = \frac{n}{2} - \frac{n}{2}\sqrt{\frac{\log d}{d}} + \lambda n\frac{\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right), \]
then for $\lambda > \frac{1}{2}$, with high probability, the MB(n,p;m) process percolates.

Proof. We have from Lemma 4 that, with high probability, $I_0 = [m]$ is
contained in no closed set of size less than
\[ \frac{n}{2} - \frac{n\sqrt{\log d}}{2\sqrt{d}} + \frac{n(\log\log d)^{2\lambda}}{e^8\sqrt{d\log d}}. \]
Using the Markov inequality it follows from Proposition 7 applied to
$\epsilon = \lambda - \frac{1}{2}$ that, with high probability, $I_0$ is contained in no closed set of size
between
\[ \frac{n}{2} - \frac{n\sqrt{\log d}}{2\sqrt{d}} + \frac{n(\log\log d)^{\lambda+\frac{1}{2}}}{\sqrt{d\log d}} \quad \text{and} \quad \frac{n}{2} + \frac{4n}{\sqrt{d}}. \]
We have from Lemma 5 that, with high probability, $I_0$ is contained
in no closed set of size greater than
\[ \frac{n}{2} + \frac{7n}{2\sqrt{d}}, \]
and so, for $\lambda > \frac{1}{2}$, with high probability, $I_0$ is not contained in any closed
set in G(n,p) and hence percolates. □


4. Lower Bound 

In this section we shall show the lower bound of Theorem 1. We show
the following result.

Lemma 9. Fix some number $\epsilon > 0$. Assume that for n large enough,
\[ (1+\epsilon)\log n \le p(1-p)n. \]
If the initially infected set $I_0$ has size
\[ m = \frac{n}{2} - \frac{n}{2}\sqrt{\frac{\log d}{d}} + \lambda n\frac{\log\log\log d}{\sqrt{d\log d}} + o\left(n\frac{\log\log\log d}{\sqrt{d\log d}}\right), \]
then, for $\lambda < 0$, with high probability, the MB(n,p;m) process does not
percolate.

Remark 10. Note that Lemma 9 and Corollary 8 prove Theorem 1.

In fact to prove Lemma 9, as might be expected, we shall show that,
with high probability, the MB(n,p;m) process terminates with I (the set
of eventually infected vertices) only slightly larger than $|I_0| = m$. We shall
do this by bounding the expected number of sets of some size that could be
the first vertices to be infected.


Proof. We say that a set of vertices T percolates if all of its vertices will be
infected eventually. For $T \subseteq I \setminus I_0$ we can order the vertices of $I_0 \cup T$ by
the time they get infected. That is, take any order of T such that a vertex
from $I_j$ is infected before any vertex from $I_{j'}$ if $j < j'$. Notice that for each
$v \in T$ the majority of its neighbours (in the whole graph) are in the set of
its predecessors in this order. Our strategy will be to show that if $\lambda < 0$
then, with high probability, there is no percolating set T of a particular size
and thus the MB(n,p;m) process does not percolate.

Set t = |T|, and denote by E = E(T) the edge set of T. Write
$d_E(i)$ for the degree within T of a vertex $i \in T$. We want to bound the
probability that T percolates. To do so, we modify the infection rule within
T so that the vertices inside T consider their neighbours in T to be already
infected, regardless of their real state at any particular time step. This
modification only increases the probability and, more importantly, makes the
events that the vertices in T become infected independent. This is because these
events now only depend on how many edges each vertex has to $I_0$ and to
$V(G) \setminus (I_0 \cup T)$. Conditioning on E and then taking the expectation gives
\[ \mathbb{P}(T \text{ percolates}) \le \sum_{E} \mathbb{P}(E) \prod_{i=1}^{t} \mathbb{P}\left(B(m,p) + d_E(i) \ge B(n-m-t,p)\right) \qquad (11) \]
(for independent random variables B(m,p) and B(n-m-t,p)). Denote
$g(x) = \mathbb{P}(B(m,p) + x \ge B(n-m-t,p))$. Due to the log-concavity of g
(Corollary 13) we have for integers x, y, that


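\[ g(x) \le g(y)\left(\frac{g(y+1)}{g(y)}\right)^{x-y}. \]
Using the latter inequality with $x = d_E(i)$ and $y = \lceil pt \rceil$, we can bound (11)
by
\[
\begin{aligned}
\sum_{E} \mathbb{P}(E) \prod_{i=1}^{t} g(y)\left(\frac{g(y+1)}{g(y)}\right)^{d_E(i)-y}
&= \sum_{E} \mathbb{P}(E)\, g(y)^t \left(\frac{g(y+1)}{g(y)}\right)^{2|E|-ty} \\
&= \sum_{E} p^{|E|}(1-p)^{\binom{t}{2}-|E|} g(y)^t \left(\frac{g(y+1)}{g(y)}\right)^{2|E|-ty} \\
&= \left(1 - p + p\left(\frac{g(y+1)}{g(y)}\right)^2\right)^{\binom{t}{2}}\left(\frac{g(y)}{g(y+1)}\right)^{ty} g(y)^t.
\end{aligned}
\]
Substituting $\frac{g(y+1)}{g(y)} = 1 + a$ and the elementary inequality $1/(1+a) \le
1 - a + a^2$, we bound the latter expression by
\[
\begin{aligned}
\left(1 - p + p(1+a)^2\right)^{\binom{t}{2}}\left(1 - a + a^2\right)^{ty} g(y)^t
&\le \exp\left((2ap + a^2p)\frac{t^2}{2} + (a^2 - a)pt^2\right) g(y)^t \\
&= \left(\exp\left(\frac{3pa^2t}{2}\right) g(y)\right)^{t}. \qquad (12)
\end{aligned}
\]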

We have by definition that g(y) is equal to
\[ g(y) = \mathbb{P}\left(X_1 + X_2 \ge \mu_1 + \mu_2 + pn - 2pm - pt - \lceil pt \rceil\right), \]
where $X_1 = B(m,p)$ with mean $\mu_1$ and $X_2 = B(n-m-t, 1-p)$ with
mean $\mu_2$. Setting $t = \left\lceil n(\log\log d)^{\lambda}/\sqrt{d\log d} \right\rceil$ and using Proposition 22 with
$N = \frac{n-t}{2}$, $S = \frac{n-2m-t}{2}$ and $h = p(n-2m-t) - y$ to bound g(y), we obtain
(for n large enough)
\[
\begin{aligned}
g(y) &\le \frac{\sqrt{p(1-p)(n-t)}}{pn - 2pm - 2pt - 1}\exp\left(-\frac{(pn - 2pm - 2pt - 1)^2}{2p(1-p)(n-t)} + 3 + o(1)\right) \\
&\le \frac{1}{\sqrt{\log d}}\exp\left(-\frac{\log d}{2} + 2\lambda\log\log\log d + 3 + O\left((\log\log d)^{\lambda}\right)\right) \\
&\le \frac{e^4(\log\log d)^{2\lambda}}{\sqrt{d\log d}},
\end{aligned}
\]
when $\lambda < 0$.

We can also bound g(y) from below by Proposition 21,
\[
\begin{aligned}
g(y) &\ge \frac{\sqrt{p(1-p)(n-t)}}{2\pi(pn - 2pm - 2pt)}\exp\left(-\frac{(pn - 2pm - 2pt)^2}{2p(1-p)(n-t)} - 4 - o(1)\right) \\
&\ge \frac{1}{2\pi e^4\sqrt{\log d}}\exp\left(-\frac{\log d}{2} + 2\lambda\log\log\log d + O\left((\log\log d)^{\lambda}\right)\right) \\
&\ge \frac{(\log\log d)^{2\lambda}}{e^6\sqrt{d\log d}}, \qquad (13)
\end{aligned}
\]
when $\lambda < 0$ (and n is large enough).

By definition of g we have that
\[ g(y+1) = g(y) + \mathbb{P}\left(B(m,p) + y + 1 = B(n-m-t,p)\right). \]
Let us write $z = \mathbb{P}(B(m,p) + y + 1 = B(n-m-t,p))$ for convenience.
We shall now obtain an upper bound for z. Using Proposition 23 with
$T = -(y+1)$, $N = \frac{n-t+T}{2}$ and $S = N - m$, we obtain


z < 


+ 


§ - m - 2t 


2 \2 


2n(l-p)(%-2t-i) 


exp 


vr p(% -m-2t-i) 


< 


+ 


\/log d 


27r(l — p)y/d 

6 Vd 


exp 


exp 


logd 


2p(f - m -2t-j) 
(1 -p)(n-t) 
9p(f — m — 2t — |) 2N * 
"8(l-p)(f-2t-i) y 

+ 2A log log log d + o(l) 


+ °( 1 ) 


npny/log d 


exp 


91ogd 9A log log log d 
_l_ + —«_^_ +o(1) 


The first term is much larger than the second, and so we obtain (for
n large enough) the inequality
\[ z \le \frac{\sqrt{\log d}\,(\log\log d)^{2\lambda}}{\pi(1-p)d}. \qquad (14) \]
We have that $a = \frac{z}{g(y)}$, and so from (13) and (14) (for n large enough)
\[ a \le \frac{e^6\log d}{\pi(1-p)\sqrt{d}} \le \frac{e^5\log d}{(1-p)\sqrt{d}}. \]
We can now bound the expression in (12) (for n large enough) by
\[
\begin{aligned}
\mathbb{P}(T \text{ percolates}) &\le \left(\exp\left(\frac{3pe^{10}(\log d)^2 n(\log\log d)^{\lambda}}{2(1-p)^2 d\sqrt{d\log d}}\right)\frac{e^4(\log\log d)^{2\lambda}}{\sqrt{d\log d}}\right)^{t} \\
&\le \left(\frac{e^5(\log\log d)^{2\lambda}}{\sqrt{d\log d}}\right)^{t}
\end{aligned}
\]
(where the second inequality follows since the exponent in the exponential
is o(1)).

The expected number of sets of size t that percolate is (for n large
enough) at most
\[ \binom{n-m}{t}\left(\frac{e^5(\log\log d)^{2\lambda}}{\sqrt{d\log d}}\right)^{t} \le \left(\frac{e^6 n(\log\log d)^{2\lambda}}{t\sqrt{d\log d}}\right)^{t}, \]
because $\binom{n}{t} \le \left(\frac{en}{t}\right)^{t}$. We chose $t = \left\lceil \frac{n(\log\log d)^{\lambda}}{\sqrt{d\log d}} \right\rceil$, and so the expected
number of sets of size t that percolate is bounded above by
\[ \left(e^6(\log\log d)^{\lambda}\right)^{t} = o(1). \]
Therefore, with high probability, percolation does not occur for $\lambda < 0$. □


5. Inequalities 

We begin this section with some remarks on the log-concavity of the
distribution function of the Binomial distribution. These results are
standard, see for example [11], but we prove them for completeness.

Proposition 11. The sum of independent Bernoulli random variables is
log-concave, that is, if $X_i$ are independent Bernoulli random variables with
means $p_i$, then for any k we have,
\[ \mathbb{P}\left(\sum_{i=1}^{n} X_i = k-1\right)\mathbb{P}\left(\sum_{i=1}^{n} X_i = k+1\right) \le \left(\mathbb{P}\left(\sum_{i=1}^{n} X_i = k\right)\right)^2. \]

Proof. We proceed by induction on n, with the base case n = 1 being trivial
as one of the terms on the left hand side of the inequality is zero. Otherwise
conditioning on $X_{n+1}$, and writing $f_{n,k} = \mathbb{P}(\sum_{i=1}^{n} X_i = k)$ we get,
\[
\begin{aligned}
f_{n+1,k-1}f_{n+1,k+1} &= \left(p_{n+1}f_{n,k-2} + (1-p_{n+1})f_{n,k-1}\right)\left(p_{n+1}f_{n,k} + (1-p_{n+1})f_{n,k+1}\right) \\
&\le \left(p_{n+1}f_{n,k-1} + (1-p_{n+1})f_{n,k}\right)^2 \\
&= (f_{n+1,k})^2.
\end{aligned}
\]
The inequality follows as $f_{n,k-2}f_{n,k+1} \le f_{n,k-1}f_{n,k}$ is implied by the
induction hypothesis. □

Proposition 12. The cumulative distribution of a discrete non-negative
log-concave random variable X is log-concave, that is, for all k,
\[ \mathbb{P}(X \le k-1)\mathbb{P}(X \le k+1) \le \left(\mathbb{P}(X \le k)\right)^2. \]
Proof. Setting $r_i = \mathbb{P}(X = i)$ we get by Proposition 11
\[ (r_0 + \dots + r_{k-1})r_{k+1} \le (r_1 + \dots + r_k)r_k + r_kr_0, \]
and so,
\[ (r_0 + \dots + r_{k-1})(r_0 + \dots + r_{k+1}) \le (r_0 + \dots + r_k)^2. \]
□


When X is the sum of n independent Bernoulli random variables, we
can rewrite X = n - Y, where Y is also the sum of n independent Bernoulli
random variables, and so Proposition 12 is still true if we replace $\le$ with
$<$, $\ge$ or $>$.

Corollary 13. The cumulative distribution of the sum or difference of
independent binomial random variables is log-concave.

Proof. Sums and differences of independent binomial random variables are
also sums of independent Bernoulli random variables plus a constant, and
so are log-concave. □
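The log-concavity statements above are easy to test numerically on a small example. The sketch below is illustrative only (the helper names are hypothetical): it builds the distribution of a sum of independent Bernoulli variables with different means by repeated convolution, and checks the inequality $a_{k-1}a_{k+1} \le a_k^2$ for both the probability mass function and the cumulative distribution, up to a small floating-point tolerance.

```python
from itertools import accumulate

def poisson_binomial_pmf(probs):
    """pmf of X_1 + ... + X_n for independent Bernoulli X_i with means probs[i]."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += (1 - p) * mass   # X_i = 0
            nxt[k + 1] += p * mass     # X_i = 1
        pmf = nxt
    return pmf

def cumulative(pmf):
    """Cumulative distribution values P(X <= k) for k = 0..n."""
    return list(accumulate(pmf))

def is_log_concave(seq, tol=1e-12):
    """Check seq[k-1] * seq[k+1] <= seq[k]^2 for all interior k."""
    return all(seq[k - 1] * seq[k + 1] <= seq[k] ** 2 + tol
               for k in range(1, len(seq) - 1))
```

A sequence with an internal zero, such as [0.5, 0.0, 0.5], fails the check, which shows that the test is not vacuous.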


A substantial part of this section is now taken up with providing tight 
bounds, up to a constant factor, on binomial probabilities and their sums. 

Proposition 14. Suppose $pn \ge 1$ and $k = pn + h < n$, where $h > 0$. Set
\[ \beta = \frac{1}{12k} + \frac{1}{12(n-k)}, \]
then $\mathbb{P}(B(n,p) = k)$ is at least
\[ \frac{1}{\sqrt{2\pi p(1-p)n}}\exp\left(-\frac{h^2}{2p(1-p)n} - \frac{h^3}{2(1-p)^2n^2} - \frac{h^4}{3p^3n^3} - \frac{h}{2pn} - \beta\right). \]
Proof. This is Theorem 1.5 in [5], p. 12. □

Corollary 15. Suppose $p(1-p)n = \omega(n)$ and $k = pn + h$, where $0 < h =
o\left((p(1-p)n)^{2/3}\right)$, then
\[ \mathbb{P}(B(n,p) = k) \ge \frac{1}{\sqrt{2\pi p(1-p)n}}\exp\left(-\frac{h^2}{2p(1-p)n} - o(1)\right). \]
Proof. For h in this range we have
\[ \frac{h^3}{2(1-p)^2n^2} + \frac{h^4}{3p^3n^3} + \frac{h}{2pn} = o(1). \]
We also have that $k = \omega(n)$ and $n - k = \omega(n)$, and so the inequality
follows from Proposition 14. □

Proposition 16. Suppose $pn \ge 1$ and $k \ge pn + h$, where $h(1-p)n \ge 3$.
Then
\[ \mathbb{P}(B(n,p) = k) \le \frac{1}{\sqrt{2\pi p(1-p)n}}\exp\left(-\frac{h^2}{2p(1-p)n} + \frac{h^3}{p^2n^2} + \frac{h}{(1-p)n}\right). \]
Proof. This is Theorem 1.2 of [5], p. 10. □

Corollary 17. Suppose $p(1-p)n = \omega(n)$ and $k \ge pn + h$, where
\[ 1 \le h = o\left((p(1-p)n)^{2/3}\right), \]
then
\[ \mathbb{P}(B(n,p) = k) \le \frac{1}{\sqrt{2\pi p(1-p)n}}\exp\left(-\frac{h^2}{2p(1-p)n} + o(1)\right). \]
Proof. For h in this range we have
\[ \frac{h^3}{p^2n^2} + \frac{h}{(1-p)n} = o(1), \]
and so the inequality follows from Proposition 16, which can be applied as
$h(1-p)n = \omega(n)$. □


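Proposition 18. Suppose $p(1-p)n = \omega(n)$ and $0 < h = o\bigl((p(1-p)n)^{2/3}\bigr)$,
then
$$P(B(n,p) \ge pn + h) < \frac{\sqrt{p(1-p)n}}{h\sqrt{2\pi}} \exp\left(-\frac{h^2}{2p(1-p)n} + o(1)\right).$$

Proof. This proof follows that of Theorem 1.3 in [5]. For $m \ge pn + h$, we
have
$$\frac{P(B(n,p) = m+1)}{P(B(n,p) = m)} \le 1 - \frac{h + (1-p)}{(1-p)(pn + h + 1)} = \lambda.$$
Hence,
$$P(B(n,p) \ge pn + h) \le \frac{1}{1-\lambda}\, P\bigl(B(n,p) = \lceil pn + h \rceil\bigr).$$
As
$$(1-\lambda)^{-1} \le \frac{p(1-p)n}{h}\left(1 + \frac{h}{pn}\right) \le \frac{p(1-p)n}{h}\, e^{h/(pn)},$$
we get from Proposition 16 that
$$P(B(n,p) \ge pn + h) < \frac{\sqrt{p(1-p)n}}{h\sqrt{2\pi}} \exp\left(-\frac{h^2}{2p(1-p)n} + \frac{h}{p(1-p)n} + \frac{h^3}{p^2n^2}\right),$$
the last two terms in the exponent being $o(1)$, for $h = o\bigl((p(1-p)n)^{2/3}\bigr)$. □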
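
The non-asymptotic tail bound obtained at the end of the proof of Proposition 18, namely $\frac{\sqrt{p(1-p)n}}{h\sqrt{2\pi}}\exp\bigl(-\frac{h^2}{2p(1-p)n} + \frac{h}{p(1-p)n} + \frac{h^3}{p^2n^2}\bigr)$, can be compared with exact tails. The sketch below assumes this form of the bound and uses the illustrative parameters $n = 1000$, $p = 0.3$; the values of $h$ and the function names are our own choices.

```python
import math

def binom_upper_tail(n, p, m):
    """Exact P(B(n, p) >= m), summed from log-pmf values."""
    total = 0.0
    for k in range(m, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * math.log(p) + (n - k) * math.log(1 - p))
        total += math.exp(log_pmf)
    return total

def tail_bound(n, p, h):
    """Upper bound on P(B(n, p) >= pn + h) with explicit correction terms."""
    q = 1 - p
    prefactor = math.sqrt(p * q * n) / (h * math.sqrt(2 * math.pi))
    exponent = (-h**2 / (2 * p * q * n)
                + h / (p * q * n)
                + h**3 / (p**2 * n**2))
    return prefactor * math.exp(exponent)

n, p, pn = 1000, 0.3, 300   # pn is an integer for this choice
# ratio of the exact tail to the bound, for a few deviations h
ratios = {h: binom_upper_tail(n, p, pn + h) / tail_bound(n, p, h)
          for h in (15, 30, 45, 60)}
```

Each ratio should lie strictly between 0 and 1; the bound becomes less tight for larger $h$, where the term $h^3/(p^2n^2)$ is no longer negligible.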

Proposition 19. Suppose $p(1-p)n = \omega(n)$ and
$$(p(1-p)n)^{1/2} \le h = o\bigl((p(1-p)n)^{2/3}\bigr),$$
then
$$P(B(n,p) \ge pn + h) > \frac{\sqrt{p(1-p)n}}{h\sqrt{2\pi}} \exp\left(-\frac{h^2}{2p(1-p)n} - \frac{3}{2} - o(1)\right).$$

Proof. Due to the unimodality of the binomial distribution, we have that
the probability mass function of the binomial distribution is decreasing
away from its mean, and so,
$$P(B(n,p) \ge pn + h) \ge \frac{p(1-p)n}{h}\, P\left(B(n,p) = pn + h + \frac{p(1-p)n}{h}\right).$$
We can apply Corollary 15 as $h + \frac{p(1-p)n}{h} = o\bigl((p(1-p)n)^{2/3}\bigr)$, and so
it follows that
$$P(B(n,p) \ge pn + h) \ge \frac{\sqrt{p(1-p)n}}{h\sqrt{2\pi}} \exp\left(-\frac{\left(h + \frac{p(1-p)n}{h}\right)^2}{2p(1-p)n} - o(1)\right).$$
This is greater than the stated bound because
$$\left(h + \frac{p(1-p)n}{h}\right)^2 \le h^2 + 3p(1-p)n.$$
□


We shall also want a weaker but more general bound than Proposition 18,
due to Bernstein [4].

Lemma 20. Let $X_1, \dots, X_n$ be independent zero-mean random variables.
Suppose that $|X_i| \le M$, then for all positive $t$,
$$P\left(\sum_{i=1}^{n} X_i > t\right) \le \exp\left(-\frac{t^2}{2\sum_{j} E(X_j^2) + \frac{2}{3}Mt}\right).$$

Proof. For a proof see [7]. □ 
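
Lemma 20 is applied below to centred Bernoulli variables, for which $M = 1$ is an admissible bound and $\sum_j E(X_j^2) = np(1-p)$. The following sketch compares the resulting bound with the exact tail of a centred binomial; the parameters $n = 200$, $p = 0.25$ and the function names are assumptions made for the example.

```python
import math

def binom_pmf(n, k, p):
    """P(B(n, p) = k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def centered_tail(n, p, t):
    """Exact P(sum_i (X_i - p) >= t) for independent Bernoulli(p) X_i."""
    first = math.ceil(n * p + t)
    return sum(binom_pmf(n, k, p) for k in range(first, n + 1))

def bernstein_bound(n, p, t, M=1.0):
    """Right-hand side of Lemma 20 for n centred Bernoulli(p) variables."""
    variance_sum = n * p * (1 - p)
    return math.exp(-t**2 / (2 * variance_sum + (2 / 3) * M * t))

n, p = 200, 0.25
# (exact tail, Bernstein bound) for integer deviations t = 1..100
checks = [(centered_tail(n, p, t), bernstein_bound(n, p, t))
          for t in range(1, 101)]
```

Bernstein's inequality is weaker than Proposition 18 in the moderate-deviation range but holds for every $t > 0$, which is why it is used for the edge-count estimates later in the section.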


So far in this section we have discussed well-known deviation inequalities
for standard binomial distributions. We now present some analogous results
for sums of binomial distributions with different parameters $p$, which we
have not been able to find in the literature.


Proposition 21. Suppose that $p(1-p)N = \omega(N)$, the inequality
$$2(2p(1-p)N)^{1/2} \le h = o\bigl((p(1-p)N)^{2/3}\bigr)$$
holds and
$$hS = o\bigl(N(p(1-p)N)^{1/2}\bigr).$$
For the independent random variables $X_1 = B(N-S, p)$, with mean $\mu_1$ and
variance $\sigma_1^2$, and $X_2 = B(N+S, 1-p)$, with mean $\mu_2$ and variance $\sigma_2^2$,
we have
$$P(X_1 + X_2 \ge \mu_1 + \mu_2 + h) \ge \frac{\sqrt{2p(1-p)N}}{2\pi h} \exp\left(-\frac{h^2}{4p(1-p)N} - 4 - o(1)\right).$$

Proof. The conditions on $S$ and $h$ imply that $S = o(N)$. Set $z$ and $l$ equal
to
$$\frac{2p(1-p)N}{h} \quad\text{and}\quad \left\lfloor \frac{h}{\sqrt{2p(1-p)N}} \right\rfloor$$
respectively. We can bound
$$P(X_1 + X_2 \ge \mu_1 + \mu_2 + h)$$
from below by summing over the disjoint regions
$$\sum_{i=-l}^{l-1} P\left(X_1 < \mu_1 + \frac{h}{2} - iz,\; X_2 < \mu_2 + \frac{h}{2} + (i+1)z,\; X_1 + X_2 \ge \mu_1 + \mu_2 + h\right). \tag{15}$$
These regions are disjoint as if $X_1 < \mu_1 + \frac{h}{2} - (i+1)z$ and $X_2 < \mu_2 + \frac{h}{2} + (i+1)z$,
then $X_1 + X_2 < \mu_1 + \mu_2 + h$. For each $i$ the region specified is an isosceles
right angled triangle with axis-parallel legs of length $z$, and so there are at
least $\lfloor z \rfloor(\lfloor z \rfloor - 1)/2$ pairs of integer values $x_1, x_2$, which $X_1, X_2$ can take
while still satisfying all three relations in (15). We have that $h \ge 2lz$, and
so if $X_1, X_2$ satisfy all three relations in (15), then $X_1 \ge \mu_1$ and $X_2 \ge \mu_2$.
As we are only considering the region in which $X_1, X_2$ are larger than their
means we can bound the sum in (15) from below by
$$\sum_{i=-l}^{l-1} \frac{\lfloor z \rfloor(\lfloor z \rfloor - 1)}{2}\, P\left(X_1 = \left\lceil \mu_1 + \frac{h}{2} - iz \right\rceil\right) P\left(X_2 = \left\lceil \mu_2 + \frac{h}{2} + (i+1)z \right\rceil\right). \tag{16}$$

We have that $p(1-p)(N-S) = \omega(N)$ and $h + lz = o\bigl((p(1-p)(N-S))^{2/3}\bigr)$,
and so we can apply Corollary 15 to get that the quantity in (16) is at least
$$\sum_{i=-l}^{l-1} \frac{\lfloor z \rfloor(\lfloor z \rfloor - 1)}{4\pi\sigma_1\sigma_2} \exp\left(-\frac{\left(\frac{h}{2} - iz + 1\right)^2 (N+S) + \left(\frac{h}{2} + (i+1)z + 1\right)^2 (N-S)}{2p(1-p)(N^2 - S^2)} - o(1)\right).$$
Expanding this out, and noticing $\lfloor z \rfloor = z(1 + o(1))$ and
$$(N-S)(N+S) = N^2(1 + o(1))$$
we get that the sum in (16) is at least
$$\sum_{i=-l}^{l-1} \frac{z^2}{4\pi p(1-p)N} \exp\left(-\frac{h^2N + 2hzN + 4i^2z^2N + (4i+2)z^2N + o(p(1-p)N^2)}{4p(1-p)(N^2 - S^2)} - o(1)\right), \tag{17}$$
where the approximations for $\lfloor z \rfloor$ and $\sigma_1, \sigma_2$ have been taken care of in the
$o(1)$ in the exponential term. We have that $4i^2 + 4i + 2 \le 6l^2$ and $l^2z^2 \le 2p(1-p)N$,
and so using the bounds in the statement of the proposition,
the sum in (17) is at least
$$\begin{aligned}
\sum_{i=-l}^{l-1} \frac{z^2}{4\pi p(1-p)N} &\exp\left(-\frac{h^2N + 16p(1-p)N^2}{4p(1-p)(N^2 - S^2)} - o(1)\right) \\
&= \frac{lz^2}{2\pi p(1-p)N} \exp\left(-\frac{h^2N}{4p(1-p)(N^2 - S^2)} - 4 - o(1)\right) \\
&\ge \frac{\sqrt{2p(1-p)N}}{2\pi h} \exp\left(-\frac{h^2}{4p(1-p)N} - 4 - o(1)\right).
\end{aligned}$$
The last inequality following because $l \ge h/(2\sqrt{2p(1-p)N})$ and
$hS = o\bigl(N(p(1-p)N)^{1/2}\bigr)$. □


Proposition 22. Suppose that $p(1-p)N = \omega(N)$. Furthermore assume
that
$$2(2p(1-p)N)^{1/2} \le h = o\bigl((p(1-p)N)^{2/3}\bigr) \quad\text{and}\quad Sh = o\bigl(N(p(1-p)N)^{1/2}\bigr).$$
Then we have
$$P(X_1 + X_2 \ge \mu_1 + \mu_2 + h) < \frac{\sqrt{2p(1-p)N}}{h} \exp\left(-\frac{h^2}{4p(1-p)N} + 3 + o(1)\right),$$
for independent random variables $X_1 = B(N-S, p)$ with mean $\mu_1$ and
variance $\sigma_1^2$, and $X_2 = B(N+S, 1-p)$ with mean $\mu_2$ and variance $\sigma_2^2$.

Proof. The conditions on $S$ and $h$ imply that $S = o(N)$. Set
$$z = \frac{2Np(1-p)}{h} \quad\text{and}\quad l = \left\lfloor \frac{h^2}{4Np(1-p)} \right\rfloor.$$
We bound $P(X_1 + X_2 \ge \mu_1 + \mu_2 + h)$ from above by
covering the region where this inequality holds by
$$\begin{aligned}
P(X_1 + X_2 \ge \mu_1 + \mu_2 + h) \le{} & && (18)\\
& \sum_{i+j \ge -1} P\left(0 \le X_1 - \mu_1 - \frac{h}{2} - iz < z\right) P\left(0 \le X_2 - \mu_2 - \frac{h}{2} - jz < z\right) && (19)\\
& + P\left(X_1 \ge \mu_1 + \frac{h}{2} + lz\right) && (20)\\
& + P\left(X_2 \ge \mu_2 + \frac{h}{2} + lz\right). && (21)
\end{aligned}$$

We shall bound these three summands separately. Again because
$h \ge 2lz$ we are only considering the range in which $X_1$ and $X_2$ are greater
than their means. Firstly for each $i, j$ pair there are at most $\lceil z \rceil^2$ points
inside the specified region, and so the product inside the sum of (19) is at
most
$$\lceil z \rceil^2\, P\left(X_1 = \left\lceil \mu_1 + \frac{h}{2} + iz \right\rceil\right) P\left(X_2 = \left\lceil \mu_2 + \frac{h}{2} + jz \right\rceil\right).$$
We have that $p(1-p)(N \pm S) = \omega(N)$ and
$$1 < h \pm lz = o\bigl((p(1-p)(N \pm S))^{2/3}\bigr),$$
and so we can apply Corollary 17 to get that the sum in (19) is at most
$$\sum_{i+j \ge -1} \frac{\lceil z \rceil^2}{2\pi p(1-p)\sqrt{N^2 - S^2}} \exp\left(-\frac{\left(\frac{h}{2} + iz\right)^2 (N+S) + \left(\frac{h}{2} + jz\right)^2 (N-S)}{2p(1-p)(N^2 - S^2)} + o(1)\right).$$
This is equal to
$$\frac{z^2}{2\pi p(1-p)N} \exp\left(-\frac{h^2N}{4p(1-p)(N^2 - S^2)} + o(1)\right) \sum_{i+j \ge -1} \exp\left(-\frac{h(i+j)zN + hzS(i-j) + z^2N(i^2+j^2) + z^2S(i^2-j^2)}{2p(1-p)(N^2 - S^2)}\right). \tag{22}$$

We can bound the above by noting that $|i - j| \le \sqrt{2}\sqrt{i^2 + j^2}$ and
$|i^2 - j^2| \le i^2 + j^2$. As we also have that $Np(1-p)/2 \le z^2 l \le Np(1-p)$,
the inner sum appearing in (22) is at most
$$\sum_{i+j \ge -1} \exp\left(-(i+j) + \sqrt{\frac{i^2+j^2}{4l}} - \frac{i^2+j^2}{4l} + o(1)\right). \tag{23}$$
A point $(i, j)$ in the plane with integer coordinates and
$$\frac{i^2+j^2}{4l} - \sqrt{\frac{i^2+j^2}{4l}} < t,$$
also satisfies $|i - j| < \sqrt{21tl}$, as if $|i - j| \ge \sqrt{21tl}$, then $i^2 + j^2 \ge \frac{21tl}{2}$ and so
$$\frac{i^2+j^2}{4l} - \sqrt{\frac{i^2+j^2}{4l}} \ge \frac{21t}{8} - \sqrt{\frac{21t}{8}} \ge t.$$
Therefore the number of points $(i, j)$ in the plane with integer coordinates
and satisfying both $\frac{i^2+j^2}{4l} - \sqrt{\frac{i^2+j^2}{4l}} < t$, and $-1 \le i + j < t$ is at most
$2(t+1)\sqrt{21lt}$. This allows us to crudely bound (23) by
$$\sum_{t=1}^{\infty} 2\sqrt{21l}\,(t+1)\sqrt{t}\, \exp(-(t-1)).$$
The latter sum is less than $50\sqrt{l}$, and so the sum in (19) is bounded
above by
$$\frac{50\sqrt{p(1-p)N}}{h\pi} \exp\left(-\frac{h^2}{4p(1-p)N} + o(1)\right). \tag{24}$$

Secondly we bound the probability (20). As $l \ge \frac{h^2}{8Np(1-p)}$ we have
that
$$P\left(X_1 \ge \mu_1 + \frac{h}{2} + lz\right) \le P\left(X_1 \ge \mu_1 + \frac{3h}{4}\right).$$
By Proposition 18 we get that the quantity in (20) is at most
$$\frac{4\sqrt{p(1-p)(N-S)}}{3h\sqrt{2\pi}} \exp\left(-\frac{9h^2}{32p(1-p)(N-S)} + o(1)\right) \le \frac{2\sqrt{2p(1-p)N}}{3\sqrt{\pi}\,h} \exp\left(-\frac{h^2}{4p(1-p)N} + o(1)\right). \tag{25}$$
Similarly, the probability in (21) is at most
$$\frac{2\sqrt{2p(1-p)N}}{3\sqrt{\pi}\,h} \exp\left(-\frac{h^2}{4p(1-p)N} + o(1)\right). \tag{26}$$
As $\frac{50}{\sqrt{2}\pi} + \frac{2}{3\sqrt{\pi}} + \frac{2}{3\sqrt{\pi}} < e^3$ we get that the sum of our three bounds,
(24), (25) and (26), is at most the stated bound. □
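
Propositions 21 and 22 together say that $P(X_1 + X_2 \ge \mu_1 + \mu_2 + h)$ equals $\frac{\sqrt{2p(1-p)N}}{h}\exp\bigl(-\frac{h^2}{4p(1-p)N}\bigr)$ up to a constant factor between roughly $e^{-4}/(2\pi)$ and $e^{3}$. The sketch below computes the exact tail for one finite, illustrative choice of parameters ($N = 400$, $S = 10$, $p = 0.3$, $h = 39$, all assumed for the example) and compares it with this main term. The $o(1)$ terms of the propositions are ignored, so this is only a plausibility check at finite $N$.

```python
import math

def binom_pmf_list(n, p):
    """Point probabilities P(B(n, p) = k) for k = 0..n."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def sum_upper_tail(N, S, p, h):
    """Exact P(X1 + X2 >= mu1 + mu2 + h) for independent
    X1 ~ B(N - S, p) and X2 ~ B(N + S, 1 - p)."""
    pmf1 = binom_pmf_list(N - S, p)
    pmf2 = binom_pmf_list(N + S, 1 - p)
    # tail2[m] = P(X2 >= m); the extra final entry is 0
    tail2 = [0.0] * (len(pmf2) + 1)
    for m in range(len(pmf2) - 1, -1, -1):
        tail2[m] = tail2[m + 1] + pmf2[m]
    mean_sum = (N - S) * p + (N + S) * (1 - p)
    # small guard against floating-point error in the mean
    threshold = math.ceil(mean_sum + h - 1e-9)
    total = 0.0
    for x1, mass in enumerate(pmf1):
        need = max(threshold - x1, 0)
        if need < len(tail2):
            total += mass * tail2[need]
    return total

N, S, p, h = 400, 10, 0.3, 39
q = 1 - p
main_term = math.sqrt(2 * p * q * N) / h * math.exp(-h**2 / (4 * p * q * N))
ratio = sum_upper_tail(N, S, p, h) / main_term
lower_constant = math.exp(-4) / (2 * math.pi)   # constant from Proposition 21
upper_constant = math.exp(3)                    # constant from Proposition 22
```

Here $h = 39$ satisfies $h \ge 2\sqrt{2p(1-p)N} \approx 25.9$, as required by both propositions.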


Proposition 23. Suppose that $p(1-p)N = \omega(N)$, that
$$\omega(N)(p(1-p)N)^{1/2} < pS = o\bigl((p(1-p)N)^{2/3}\bigr)$$
and that $T = o(N)$, then, for $N$ large enough,
$$P(Z_1 = Z_2 + pT) < \frac{S}{2\pi(1-p)N} \exp\left(-\frac{2pS^2}{(1-p)(2N-T)} + o(1)\right) + \frac{3}{p\pi S} \exp\left(-\frac{9pS^2}{8(1-p)N}\right),$$
for independent random variables $Z_1 = B(N-S, p)$ with mean $\mu_1$ and variance
$\sigma_1^2$ and $Z_2 = B(N+S-T, p)$ with mean $\mu_2$ and variance $\sigma_2^2$.

Proof. Let $\phi(i)$ be the probability that $Z_1 = Z_2 + pT = pN + i$, then
$$\phi(i) = \binom{N-S}{pN+i}\binom{N+S-T}{pN - pT + i}\, p^{p(2N-T)+2i}\, (1-p)^{(1-p)(2N-T)-2i}.$$
Denote the ratio between successive values of $\phi(i)$ by $\psi(i)$. We obtain
$$\psi(i) = \frac{\phi(i+1)}{\phi(i)} = \frac{p^2\bigl((1-p)N - S - i\bigr)\bigl((1-p)(N-T) + S - i\bigr)}{(1-p)^2 (pN + i + 1)\bigl(p(N-T) + i + 1\bigr)}.$$
Hence, we get
$$\psi(i) = \frac{\left(1 - \frac{S+i}{(1-p)N}\right)\left(1 + \frac{S-i}{(1-p)(N-T)}\right)}{\left(1 + \frac{i+1}{pN}\right)\left(1 + \frac{i+1}{p(N-T)}\right)}, \tag{27}$$
and so $\psi$ is a decreasing function of $i$. By noting that $e^{x - x^2} \le (1+x) \le e^x$,
for $x \ge -\frac{1}{2}$, we can bound $\psi$ for $i = o(p(1-p)N)$ (when $N$ is large enough).
We apply $e^{x - x^2} \le (1+x)$ for the terms in the numerator of (27) and
$(1+x) \le e^x$ for the terms in the denominator of (27) to get the following
lower bound of $\psi$
$$\exp\left(\frac{pST - (2N-T)(i+1-p)}{p(1-p)N(N-T)} - \left(\frac{S+i}{(1-p)N}\right)^2 - \left(\frac{S-i}{(1-p)(N-T)}\right)^2\right),$$
and we apply $(1+x) \le e^x$ for the terms in the numerator of (27) and
$e^{x - x^2} \le (1+x)$ for the terms in the denominator of (27) to get the following
upper bound of $\psi$
$$\exp\left(\frac{pST - (2N-T)(i+1-p)}{p(1-p)N(N-T)} + \left(\frac{i+1}{pN}\right)^2 + \left(\frac{i+1}{p(N-T)}\right)^2\right).$$
Substituting in $i = \pm\frac{pS}{2}$, we get (for $N$ large enough) that
$$\begin{aligned}
\psi\left(\tfrac{pS}{2}\right) &\le \exp\left(-\frac{(2N-3T)S}{2(1-p)N(N-T)}(1 + o(1))\right) \\
&\le \exp\left(-\frac{(2N-3T)S}{4(1-p)N(N-T)}\right) \\
&\le 1 - \frac{S}{3(1-p)N},
\end{aligned} \tag{28}$$
and
$$\begin{aligned}
\psi\left(-\tfrac{pS}{2}\right) &\ge \exp\left(\frac{(2N+T)S}{2(1-p)N(N-T)}(1 + o(1))\right) \\
&\ge \exp\left(\frac{(2N+T)S}{4(1-p)N(N-T)}\right) \\
&\ge 1 + \frac{S}{3(1-p)N}.
\end{aligned} \tag{29}$$
Therefore $\psi$ is greater than 1 at $i = -\frac{pS}{2}$ and less than 1 at $i = \frac{pS}{2}$.
Consequently (for $N$ large enough), the maximum value of $\phi$ occurs between
these two values.

We have that
$$\phi(i) = P(Z_1 = \mu_1 + pS + i)\, P(Z_2' = \mu_2' + pS - i),$$
where
$$Z_2' = N + S - T - Z_2 = B(N+S-T, 1-p),$$
with mean $\mu_2'$ and variance $(\sigma_2')^2$. By Corollary 17 we get that
$$\phi(i) \le \frac{1}{2\pi\sigma_1\sigma_2'} \exp\left(-\frac{(pS+i)^2(N+S-T) + (pS-i)^2(N-S)}{2p(1-p)(N-S)(N+S-T)} + o(1)\right),$$
for $|i| \le \frac{pS}{2}$. This is maximized when $i = pS\,\frac{T-2S}{2N-T}$ and there takes the value
$$\frac{1}{2\pi p(1-p)N} \exp\left(-\frac{pS^2\bigl((2N-T)^2 - (T-2S)^2\bigr)}{2(1-p)(N-S)(N+S-T)(2N-T)} + o(1)\right) = \frac{1}{2\pi p(1-p)N} \exp\left(-\frac{2pS^2}{(1-p)(2N-T)} + o(1)\right).$$
We also obtain the bounds (for $N$ large enough)
$$\phi\left(\tfrac{pS}{2}\right) \le \frac{1}{2p(1-p)\pi N} \exp\left(-\frac{pS^2(10N + 8S - 9T)}{8(1-p)(N-S)(N+S-T)} + o(1)\right) \le \frac{1}{2p(1-p)\pi N} \exp\left(-\frac{9pS^2}{8(1-p)N}\right)$$
and
$$\phi\left(-\tfrac{pS}{2}\right) \le \frac{1}{2p(1-p)\pi N} \exp\left(-\frac{pS^2(10N - 8S - T)}{8(1-p)(N-S)(N+S-T)} + o(1)\right) \le \frac{1}{2p(1-p)\pi N} \exp\left(-\frac{9pS^2}{8(1-p)N}\right).$$
Putting this all together and applying (28) and (29) we obtain (for
$N$ large enough)
$$\begin{aligned}
P(Z_1 = Z_2 + pT) &\le pS \max_i \phi(i) + \frac{\psi\left(\frac{pS}{2}\right)}{1 - \psi\left(\frac{pS}{2}\right)}\,\phi\left(\tfrac{pS}{2}\right) + \frac{1}{\psi\left(-\frac{pS}{2}\right) - 1}\,\phi\left(-\tfrac{pS}{2}\right) \\
&\le \frac{S}{2\pi(1-p)N} \exp\left(-\frac{2pS^2}{(1-p)(2N-T)} + o(1)\right) + \frac{3}{p\pi S} \exp\left(-\frac{9pS^2}{8(1-p)N}\right).
\end{aligned}$$
□
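
The proof of Proposition 23 locates the maximum of $\phi(i) = P(Z_1 = pN + i)\,P(Z_2 = pN - pT + i)$ near $i = pS\,\frac{T-2S}{2N-T}$. The sketch below compares this prediction with the exact maximiser for one illustrative set of parameters ($N = 2000$, $S = 100$, $T = 50$, $p = 0.3$, all assumptions of the example, chosen so that $pN$ and $pT$ are integers).

```python
import math

def log_binom_pmf(n, k, p):
    """log P(B(n, p) = k), computed via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

N, S, T = 2000, 100, 50
p = 0.3
pN, pT = 600, 15   # p*N and p*T, integers for this choice of parameters

def log_phi(i):
    """log of P(Z1 = pN + i) * P(Z2 = pN - pT + i), where
    Z1 ~ B(N - S, p) and Z2 ~ B(N + S - T, p) are independent."""
    return (log_binom_pmf(N - S, pN + i, p)
            + log_binom_pmf(N + S - T, pN - pT + i, p))

# search over |i| <= pS/2 = 15, the window used in the proof
exact_argmax = max(range(-15, 16), key=log_phi)
predicted = p * S * (T - 2 * S) / (2 * N - T)
```

For these parameters the predicted location is about $-1.14$, and the exact maximiser over the integers is the neighbouring integer $-1$.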


We end with some propositions about the number of edges in and 
between sets in G(n,p). 

Proposition 24. Suppose that $p(1-p)n = \omega(n)$. If $n$ is large enough, then
for all $t \ge \frac{n}{5}$, we have that with probability at least $1 - 4^{-t}$ every set in
$G(n,p)$ of size $t$ has at most $p\binom{t}{2} + 2t\sqrt{p(1-p)t}$ edges.

Proof. The expected number of sets of size $t$ with more than
$p\binom{t}{2} + 2t\sqrt{p(1-p)t}$ edges is
$$\binom{n}{t}\, P\left(B\left(\tbinom{t}{2}, p\right) > p\tbinom{t}{2} + 2t\sqrt{p(1-p)t}\right).$$
By Lemma 20 and the fact that $\binom{n}{t} \le \left(\frac{en}{t}\right)^t$, this expectation is at
most
$$(5e)^t \exp\left(-\frac{4p(1-p)t^3}{2p(1-p)\binom{t}{2} + \frac{4}{3}t\sqrt{p(1-p)t}}\right).$$
As $\sqrt{p(1-p)t} = \omega(n)$, we have that if $n$ is large enough, then for all
$t \ge \frac{n}{5}$ we have
$$2p(1-p)\tbinom{t}{2} + \tfrac{4}{3}t\sqrt{p(1-p)t} \le 1.001\,p(1-p)t^2.$$
Substituting this in we have that the expected number of sets of size
$t$ with more than $p\binom{t}{2} + 2t\sqrt{p(1-p)t}$ edges is (for $n$ large enough) at most
$$\exp\left(t(\log 5 + 1) - \frac{4p(1-p)t^3}{1.001\,p(1-p)t^2}\right) < 4^{-t}.$$
□
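
The margin in the last inequality is small ($\log 5 + 1 - 4/1.001 \approx -1.3866$ against $-\log 4 \approx -1.3863$), so "n large enough" is a genuine requirement. The sketch below evaluates the logarithm of the Bernstein-based bound from the proof, with $\binom{n}{t}$ replaced by $(en/t)^t$, for two illustrative values of $n$ with $p = 1/2$ and $t = n/5$; these parameters and the function name are assumptions of the example.

```python
import math

def log_expected_dense_sets(n, t, p):
    """Log of the upper bound (en/t)^t * exp(-4p(1-p)t^3 / D) on the expected
    number of t-sets with more than p*C(t,2) + 2t*sqrt(p(1-p)t) edges, where
    D = 2p(1-p)C(t,2) + (4/3) t sqrt(p(1-p)t) as in the proof."""
    q = 1 - p
    log_count = t * (1 + math.log(n / t))
    deviation_sq = 4 * p * q * t**3
    denom = 2 * p * q * (t * (t - 1) / 2) + (4 / 3) * t * math.sqrt(p * q * t)
    return log_count - deviation_sq / denom

p = 0.5
n_large = 10**9
t_large = n_large // 5
value_large = log_expected_dense_sets(n_large, t_large, p)
target_large = -t_large * math.log(4)      # log of 4^{-t}

n_small = 1000
t_small = n_small // 5
value_small = log_expected_dense_sets(n_small, t_small, p)
target_small = -t_small * math.log(4)
```

With $n = 10^9$ the bound is below $4^{-t}$, whereas with $n = 1000$ the lower-order term $\frac{4}{3}t\sqrt{p(1-p)t}$ is still too large for the estimate $1.001\,p(1-p)t^2$ to hold and the bound exceeds $4^{-t}$.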


Proposition 25. Suppose that $p(1-p)n = \omega(n)$. If $n$ is large enough then
for all $t$ in the range $\frac{n}{5} \le t \le \frac{n}{2}$ we have that with probability at least $1 - 4^{-t}$
every set in $G(n,p)$ of size $t$ has at least $pt(n-t) - 3t\sqrt{p(1-p)(n-t)}$ edges
between it and its complement.

Proof. The expected number of sets $T$ of size $t$ with less than $pt(n-t) - 3t\sqrt{p(1-p)(n-t)}$
edges between $T$ and $[n] \setminus T$ is
$$\binom{n}{t}\, P\left(B\bigl(t(n-t), 1-p\bigr) > (1-p)t(n-t) + 3t\sqrt{p(1-p)(n-t)}\right).$$
By Lemma 20 and the fact that $\binom{n}{t} \le \left(\frac{en}{t}\right)^t$, this expectation is at
most
$$(5e)^t \exp\left(-\frac{9p(1-p)t^2(n-t)}{2p(1-p)t(n-t) + 2t\sqrt{p(1-p)(n-t)}}\right).$$
As $\sqrt{(n-t)p(1-p)} = \omega(n)$, we have that if $n$ is large enough, then
for all $t$ in the range $\frac{n}{5} \le t \le \frac{n}{2}$,
$$2p(1-p)t(n-t) + 2t\sqrt{p(1-p)(n-t)} \le \frac{9}{4}\,p(1-p)t(n-t).$$
Substituting this in we have that the expected number of sets $T$ with
a small number of edges between $T$ and $[n] \setminus T$ is (for $n$ large enough) less
than
$$\exp\bigl(t(\log 5 + 1) - 4t\bigr) < 4^{-t}.$$
□
Proposition 26. Suppose that $p(1-p)n \ge 4\log n$. If $n$ is large enough,
then for all t < ^ we have that with probability at least 1 — to” 120 ; for every 
set T in G(n,p) of size t there are at least twice as many edges between T 
and [n] \T as there are in T. 


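Proof. The expected number of sets $T$ of size $t$ such that there are less than
twice as many edges between $T$ and $[n] \setminus T$ as there are in $T$ is
$$\binom{n}{t}\, P\left(B\bigl(t(n-t), p\bigr) < 2B\left(\tbinom{t}{2}, p\right)\right),$$
for independent random variables $B\bigl(t(n-t), p\bigr)$ and $B\left(\binom{t}{2}, p\right)$.

We can rewrite this as,
$$\binom{n}{t}\, P\left(2B\left(\tbinom{t}{2}, p\right) - pt(t-1) - B\bigl(t(n-t), p\bigr) + pt(n-t) > pt(n-2t+1)\right).$$
By Lemma 20 this is at most
$$\binom{n}{t} \exp\left(-\frac{\bigl(pt(n-2t+1)\bigr)^2}{2p(1-p)t(n+t-2) + \frac{4pt(n-2t+1)}{3}}\right). \tag{30}$$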


For t < using the inequality (”) < n* we have that the quantity 
in (13(1 is (for n large enough) less than 


n exp 


pt( 


lln\2 
12 > 


10 n 
3 


< n exp — 


4tlogn 121n\ 


n 


480 ) 


= n 120. 


For < t < using the inequality (") < we have that the 


quantity in (15CT1) is (for n large enough) less than, 



pt(ff 


10n 

3 


< 


2 4 ) 

n s J 


< n 


t 

' 120 


□ 


Corollary 27. Suppose that $pn \ge \log n$. If $n$ is large enough, then for all

24 —t 

t satisfying 71,25 < t < we have that with probability at least 1 — 77120, 
for every set $T$ in $G(n,p)$ of size $t$, there are at least twice as many edges
between $T$ and $[n] \setminus T$ as there are in $T$.


Proof. By the exact same reasoning as in Proposition 26, the expected number
of sets $T$ of size $t$ with less than twice as many edges between $T$ and
$[n] \setminus T$ as there are in $T$ is (for $n$ large enough) at most




< 



_ t_ 

< n 120 . 


□

Proposition 28. For every fixed $\varepsilon > 0$ and $p \ge \frac{(1+\varepsilon)\log n}{n}$, with high
probability, the minimal degree of $G(n,p)$ is greater than 8.

Proof. The expected number of vertices with degree at most 8 is bounded
by
$$\begin{aligned}
n P(B(n-1, p) \le 8) &= n \sum_{i=0}^{8} \binom{n-1}{i} p^i (1-p)^{n-1-i} \\
&\le 9n \binom{n-1}{8} p^8 (1-p)^{n-9} \\
&\le \frac{9n^9}{8!}\, p^8 (1-p)^{n-9}.
\end{aligned} \tag{31}$$
These inequalities follow as $\max_{i \le 8} P(B(n-1, p) = i)$ occurs (for $n$
large enough) when $i = 8$, and so $P(B(n-1, p) \le 8) \le 9P(B(n-1, p) = 8)$.
The last line of (31) is maximised over $0 \le p \le 1$ when $\frac{8}{p} = \frac{n-9}{1-p}$, that is
when $p = \frac{8}{n-1}$. So for $p$ in our range, (31) is maximised when $p = \frac{(1+\varepsilon)\log n}{n}$.
Therefore (for $n$ large enough)
$$n P(B(n-1, p) \le 8) \le \frac{9n^9(1+\varepsilon)^8(\log n)^8}{8!\,n^8} \exp\left(-\frac{(n-9)(1+\varepsilon)\log n}{n}\right) \le \frac{(\log n)^8}{n^{\varepsilon/2}}.$$
□
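
The chain of inequalities in (31) and the final estimate can be evaluated for concrete parameters. The sketch below does so for $n = 10^6$ and $\varepsilon = 1$ (illustrative values assumed for the example) at $p = (1+\varepsilon)\log n / n$; the function name is ours. At this size the final bound $(\log n)^8 / n^{\varepsilon/2}$ is still far larger than 1, so the statement only becomes informative for much larger $n$.

```python
import math

def low_degree_chain(n, eps):
    """Quantities in (31) at p = (1 + eps) log n / n:
    the expected number of vertices of degree at most 8, the two upper
    bounds from (31), and the final bound (log n)^8 / n^(eps/2)."""
    p = (1 + eps) * math.log(n) / n
    pmf = [math.comb(n - 1, i) * p**i * (1 - p)**(n - 1 - i) for i in range(9)]
    expected = n * sum(pmf)                 # n * P(B(n-1, p) <= 8)
    middle = 9 * n * pmf[8]                 # 9n * P(B(n-1, p) = 8)
    last = 9 * n**9 / math.factorial(8) * p**8 * (1 - p)**(n - 9)
    final = math.log(n)**8 / n**(eps / 2)
    return expected, middle, last, final

expected, middle, last, final = low_degree_chain(10**6, 1.0)
```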


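Proposition 29. Suppose that $(1+\varepsilon)\log n \le pn \le 5\log n$. If $n$ is large
enough, then for all $t$ satisfying $t \le n^{29/30}$, we have that with probability at
least $1 - n^{-t/120}$, every set in $G(n,p)$ of size $t$ has at most $2t$ edges.

Proof. The expected number of sets $T$ in $G(n,p)$ of size $t$ with at least $2t$
edges is
$$\binom{n}{t}\, P\left(B\left(\tbinom{t}{2}, p\right) \ge 2t\right) = \binom{n}{t} \sum_{i=2t}^{\binom{t}{2}} \binom{\binom{t}{2}}{i} p^i (1-p)^{\binom{t}{2} - i}. \tag{32}$$
By carefully bounding the summands in (32) for $i = 2t$ and $i = 2t+1$,
we shall get a good bound on the total sum. We have that
$$\binom{\binom{t}{2}}{2t} p^{2t} (1-p)^{\binom{t}{2} - 2t} \le \left(\frac{etp}{4}\right)^{2t} \le \left(\frac{5et\log n}{4n}\right)^{2t}.$$
We also get
$$\frac{\binom{\binom{t}{2}}{2t+1} p^{2t+1} (1-p)^{\binom{t}{2} - 2t - 1}}{\binom{\binom{t}{2}}{2t} p^{2t} (1-p)^{\binom{t}{2} - 2t}} = \frac{p\left(\binom{t}{2} - 2t\right)}{(1-p)(2t+1)} \le \frac{1}{2}.$$
Because the ratio between consecutive terms in the sum in (32) decreases
as $i$ increases, we have from above that the total sum is at most
twice the first term, therefore
$$\binom{n}{t}\, P\left(B\left(\tbinom{t}{2}, p\right) \ge 2t\right) \le 2\left(\frac{en}{t}\right)^t \left(\frac{5et\log n}{4n}\right)^{2t} = 2\,\frac{e^{3t}\,25^t (\log n)^{2t}\, t^t}{16^t n^t} \le 2\left(\frac{25e^3(\log n)^2}{16\,n^{1/30}}\right)^t,$$
and so the expected number of sets $T$ in $G(n,p)$ of size $t$ with at least $2t$
edges is (for $n$ large enough) at most $n^{-t/120}$. □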

Acknowledgement. We would like to express our gratitude to B. Bol¬ 